GRAPHENE: A Precise Biomedical Literature Retrieval Engine with Graph Augmented Deep Learning and External Knowledge Empowerment
Effective biomedical literature retrieval (BLR) plays a central role in
precision medicine informatics. In this paper, we propose GRAPHENE, which is a
deep learning based framework for precise BLR. GRAPHENE consists of three main
modules: 1) graph-augmented document representation learning; 2) query
expansion and representation learning; and 3) learning to rank biomedical
articles. The graph-augmented document representation learning module
constructs a document-concept graph containing biomedical concept nodes and
document nodes so that global biomedical concepts from external knowledge
sources can be captured; the graph is further connected to a BiLSTM so that
both local and global topics can be explored. The query expansion and
representation learning module expands the query with abbreviations and
different names, and then builds a CNN-based model to convolve the expanded
query and obtain a vector representation for each query. The learning-to-rank
module minimizes a ranking loss between biomedical articles and the query to
learn the retrieval function. Experimental results on applying our system to TREC
Precision Medicine track data are provided to demonstrate its effectiveness.
Comment: CIKM 201
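The abstract above says only that a ranking loss is minimized and does not name it. As a minimal sketch, assuming a pairwise hinge loss (a common learning-to-rank choice, not confirmed as the GRAPHENE loss), the idea of scoring relevant articles above non-relevant ones for a query could look like this. The function name `pairwise_hinge_loss` and the `margin` parameter are hypothetical.

```python
def pairwise_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss over all (relevant, non-relevant) score pairs.

    Assumption: a pairwise hinge loss stands in for the unspecified
    ranking loss; scores are model outputs for one query.
    """
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    if not pairs:
        return 0.0
    # A pair contributes zero once the relevant article outscores the
    # non-relevant one by at least `margin`.
    return sum(max(0.0, margin - (p - n)) for p, n in pairs) / len(pairs)
```

The loss is zero when every relevant article beats every non-relevant one by the margin, and grows linearly with each violation.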
Improved Combination of Multiple Atmospheric GCM Ensembles for Seasonal Prediction
An improved Bayesian optimal weighting scheme is developed and used to combine six atmospheric general circulation model (GCM) seasonal hindcast ensembles. The approach is based on the prior belief that the forecast probabilities of tercile-category precipitation and near-surface temperature are equal to the climatological ones. The six GCMs are integrated over the 1950–97 period with observed monthly SST prescribed at the lower boundary, with 9–24 ensemble members. The weights of the individual models are determined by maximizing the log likelihood of the combination by season over the integration period. A key ingredient of the scheme is the climatological equal-odds forecast, which is included as one of the "models" in the multimodel combination. Simulation skill is quantified in terms of the cross-validated ranked probability skill score (RPSS) for the three-category probabilistic hindcasts. The individual GCM ensembles, simple poolings of three and six models, and the optimally combined multimodel ensemble are compared. The Bayesian optimal weighting scheme outperforms the pooled ensemble, which in turn outperforms the individual models. In the extratropics, its main benefit is to bring much of the large area of negative-precipitation RPSS values up to near-zero values. The skill of the optimal combination is almost always increased (in the large spatial averages considered) when the number of models in the combination is increased from three to six, regardless of which models are included in the three-model combination. Improvements are made to the original Bayesian scheme of Rajagopalan et al. by reducing the dimensionality of the numerical optimization, averaging across data subsamples, and including spatial smoothing of the likelihood function. These modifications are shown to yield increases in cross-validated RPSS skills. 
The revised scheme appears to be better suited to combining larger sets of models, and, in the future, it should be possible to include statistical models into the weighted ensemble without fundamental difficulty.
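The skill measure named in the abstract, the ranked probability skill score for three-category forecasts, can be sketched as follows. This is an illustrative implementation of the standard RPS/RPSS definitions with an equal-odds climatological reference. The function names and the aggregation (summing RPS over all forecasts before taking the ratio) are assumptions, and the paper's cross-validation and spatial averaging are not reproduced.

```python
from itertools import accumulate

def rps(probs, obs_category):
    """Ranked probability score for one categorical forecast.

    probs: forecast probabilities for the ordered categories.
    obs_category: index of the category that was observed.
    """
    cum_forecast = list(accumulate(probs))
    cum_observed = [1.0 if k >= obs_category else 0.0 for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))

def rpss(forecasts, observations, climatology=(1 / 3, 1 / 3, 1 / 3)):
    """Skill relative to the climatological equal-odds forecast.

    1 is a perfect forecast, 0 matches climatology, negative is worse.
    """
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observations))
    rps_reference = sum(rps(climatology, o) for o in observations)
    return 1.0 - rps_forecast / rps_reference
```

Under this definition, a forecast that always issues the climatological probabilities scores zero, which is why negative RPSS values being raised to near zero counts as an improvement.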
Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture
We describe a new deep learning architecture for learning to rank question
answer pairs. Our approach extends the long short-term memory (LSTM) network
with holographic composition to model the relationship between question and
answer representations. As opposed to the neural tensor layer that has been
adopted recently, the holographic composition provides the benefits of scalable
and rich representational learning approach without incurring huge parameter
costs. Overall, we present Holographic Dual LSTM (HD-LSTM), a unified
architecture for both deep sentence modeling and semantic matching.
Essentially, our model is trained end-to-end whereby the parameters of the LSTM
are optimized in a way that best explains the correlation between question and
answer representations. In addition, our proposed deep learning architecture
requires no extensive feature engineering. Via extensive experiments, we show
that HD-LSTM outperforms many other neural architectures on two popular
benchmark QA datasets. Empirical studies confirm the effectiveness of
holographic composition over the neural tensor layer.
Comment: SIGIR 2017 Full Paper
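The abstract does not spell out the holographic composition operator. As a sketch, assuming it is circular correlation (the operator used in holographic embeddings), two d-dimensional vectors are composed into a single d-dimensional vector with no extra parameters, which is consistent with the low parameter cost claimed relative to a neural tensor layer. The function name is hypothetical, and the direct O(d^2) loop is for clarity only.

```python
def circular_correlation(a, b):
    """Compose two equal-length vectors by circular correlation.

    Output element k is sum_i a[i] * b[(k + i) mod d].
    Assumption: this operator stands in for "holographic composition".
    """
    d = len(a)
    if len(b) != d:
        raise ValueError("vectors must have the same length")
    return [sum(a[i] * b[(k + i) % d] for i in range(d)) for k in range(d)]
```

For example, `circular_correlation([1, 0, 0], [1, 2, 3])` gives `[1, 2, 3]`. The operator is not commutative, so question and answer representations play different roles in the composition.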
Gene expression studies for the analysis of domoic acid production in the marine diatom Pseudo-nitzschia multiseries
Background:
Pseudo-nitzschia multiseries Hasle (Hasle) (Ps-n) is distinctive among the ecologically important marine diatoms because it produces the neurotoxin domoic acid. Although the biology of Ps-n has been investigated intensely, the characterization of the genes and biochemical pathways leading to domoic acid biosynthesis has been limited. To identify transcripts whose levels correlate with domoic acid production, we analyzed Ps-n under conditions of high and low domoic acid production by cDNA microarray technology and reverse-transcription quantitative PCR (RT-qPCR) methods. Our goals included identifying and validating robust reference genes for Ps-n RNA expression analysis under these conditions.
Results:
Through microarray analysis of exponential- and stationary-phase cultures with low and high domoic acid production, respectively, we identified candidate reference genes whose transcripts did not vary across conditions. We tested eleven potential reference genes for stability using RT-qPCR and GeNorm analyses. Our results indicated that transcripts encoding JmjC, dynein, and histone H3 proteins were the most suitable for normalization of expression data under conditions of silicon-limitation, in late-exponential through stationary phase. The microarray studies identified a number of genes that were up- and down-regulated under toxin-producing conditions. RT-qPCR analysis, using the validated controls, confirmed the up-regulation of transcripts predicted to encode a cycloisomerase, an SLC6 transporter, phosphoenolpyruvate carboxykinase, glutamate dehydrogenase, a small heat shock protein, and an aldo-keto reductase, as well as the down-regulation of a transcript encoding a fucoxanthin-chlorophyll a-c binding protein, under these conditions.
Conclusion:
Our results provide a strong basis for further studies of RNA expression levels in Ps-n, which will contribute to our understanding of genes involved in the production and release of domoic acid, an important neurotoxin that affects human health as well as ecosystem function.
Plymouth State University Graduate Programs Office; Woods Hole Oceanographic Institution Academic Programs Office; New Hampshire IDeA Network of Biological Research Excellence (NH-INBRE); National Center for Research Resources (U.S.) (Grant 5P20RR030360-03); National Institute of General Medical Sciences (U.S.) (Grant 8P20GM103506-03)
The regulation of miRNAs by reconstituted high-density lipoproteins in diabetes-impaired angiogenesis
Diabetic vascular complications are associated with impaired ischaemia-driven angiogenesis. We recently found that reconstituted high-density lipoproteins (rHDL) rescue diabetes-impaired angiogenesis. microRNAs (miRNAs) regulate angiogenesis and are transported within HDL to sites of injury/repair. The role of miRNAs in the rescue of diabetes-impaired angiogenesis by rHDL is unknown. Using a miRNA array, we found that rHDL inhibits hsa-miR-181c-5p expression in vitro and using a hsa-miR-181c-5p mimic and antimiR identify a novel anti-angiogenic role for miR-181c-5p. miRNA expression was tracked over time post-hindlimb ischaemic induction in diabetic mice. Early post-ischaemia when angiogenesis is important, rHDL suppressed hindlimb mmu-miR-181c-5p. mmu-miR-181c-5p was not detected in the plasma or within HDL, suggesting rHDL specifically targets mmu-miR-181c-5p at the ischaemic site. Three known angiogenic miRNAs (mmu-miR-223-3p, mmu-miR-27b-3p, mmu-miR-92a-3p) were elevated in the HDL fraction of diabetic rHDL-infused mice early post-ischaemia. This was accompanied by a decrease in plasma levels. Only mmu-miR-223-3p levels were elevated in the hindlimb 3 days post-ischaemia, indicating that rHDL regulates mmu-miR-223-3p in a time-dependent and site-specific manner. The early regulation of miRNAs, particularly miR-181c-5p, may underpin the rescue of diabetes-impaired angiogenesis by rHDL and has implications for the treatment of diabetes-related vascular complications
A protein interaction atlas for the nuclear receptors: properties and quality of a hub-based dimerisation network
BACKGROUND: The nuclear receptors are a large family of eukaryotic transcription factors that constitute major pharmacological targets. They exert their combinatorial control through homotypic and heterotypic dimerisation. Elucidation of this dimerisation network is vital in order to understand the complex dynamics and potential cross-talk involved. RESULTS: Phylogeny, protein-protein interactions, protein-DNA interactions and gene expression data have been integrated to provide a comprehensive and up-to-date description of the topology and properties of the nuclear receptor interaction network in humans. We discriminate between DNA-binding and non-DNA-binding dimers, and provide a comprehensive interaction map that identifies potential cross-talk between the various pathways of nuclear receptors. CONCLUSION: We infer that the topology of this network is hub-based, and much more connected than previously thought. The hub-based topology of the network and the wide tissue expression pattern of NRs create a highly competitive environment for the common heterodimerising partners. Furthermore, a significant number of negative feedback loops is present, with the hub protein SHP [NR0B2] playing a major role. We also compare the evolution, topology and properties of the nuclear receptor network with the hub-based dimerisation network of the bHLH transcription factors in order to identify both unique themes and ubiquitous properties in gene regulation. In terms of methodology, we conclude that such a comprehensive picture can only be assembled by semi-automated text-mining, manual curation and integration of data from various sources.
Simulation of growth and development of diverse legume species in APSIM
This paper describes the physiological basis and validation of a generic legume model as it applies to 4 species: chickpea (Cicer arietinum L.), mungbean (Vigna radiata (L.) Wilczek), peanut (Arachis hypogaea L.), and lucerne (Medicago sativa L.). For each species, the key physiological parameters were derived from the literature and our own experimentation. The model was tested on an independent set of experiments, predominantly from the tropics and subtropics of Australia, varying in cultivar, sowing date, water regime (irrigated or dryland), row spacing, and plant population density. The model is an attempt to simulate crop growth and development with satisfactory comprehensiveness, without the necessity of defining a large number of parameters. A generic approach was adopted in recognition of the common underlying physiology and simulation approaches for many legume species. Simulation of grain yield explained 77, 81, and 70% of the variance (RMSD = 31, 98, and 46 g/m2) for mungbean (n = 40, observed mean = 123 g/m2), peanut (n = 30, 421 g/m2), and chickpea (n = 31, 196 g/m2), respectively. Biomass at maturity was simulated less accurately, explaining 64, 76, and 71% of the variance (RMSD = 134, 236, and 125 g/m2) for mungbean, peanut, and chickpea, respectively. RMSD for biomass in lucerne (n = 24) was 85 g/m2 with an R2 of 0.55. Simulation accuracy is similar to that achieved by single-crop models and suggests that the generic approach offers promise for simulating diverse legume species without loss of accuracy or physiological rigour.
Contrasting the Group 6 metal-metal bonding in sodium dichromate(II) and sodium dimolybdate(II) polymethyl complexes: synthetic, x-ray crystallographic and theoretical studies
Extending the class of group 6 metal-metal bonded methylate compounds supported by alkali metal counter-ions, the first sodium octamethylmolybdate(II) complex [(TMEDA)Na]4Mo2Me8 and heptamethylchromate(II) relations [(donor)Na]3Cr2Me7 (donor is TMEDA or TMCDA) are reported. The former was made by treating [(Et2O)Li]4Mo2Me8 with four equivalents of NaOtBu/TMEDA in ether; whereas the latter resulted from introducing TMEDA or TMCDA to ether solutions of octamethyldichromate [(Et2O)Na]4Cr2Me8. X-ray crystallography revealed [(TMEDA)Na]4Mo2Me8 is dimeric with square pyramidal Mo centres [including a short Mo–Mo interaction of 2.1403(3) Å] each with four methyl groups in a mutually eclipsed conformation. In dinuclear [(TMCDA)Na]3Cr2Me7 trigonal bi-pyramidal Cr centres each bond to three terminal methyl groups and one common Me bridge, that produces a strikingly short Cr–Cr contact of 1.9136(4) Å. Broken symmetry density functional theoretical calculations expose the multiconfigurational metal-metal bonding in these compounds with a Mo–Mo bond order of 3 computed for octamethylmolybdate(II). This is contrasted by the single Cr–Cr bond in heptamethylchromate(II) where the singlet ground state is derived by strong antiferromagnetic coupling between adjacent metal ions